How to Screen a Data Stream - Quality-Driven Load Shedding in Sensor Data Streams

Authors

  • Anja Klein
  • Gregor Hackenbroich
  • Wolfgang Lehner
Abstract

As most data stream sources exhibit bursty data rates, data stream management systems must repeatedly cope with load spikes that exceed the average workload to a considerable degree. To guarantee low-latency processing results, load has to be shed from the stream when data rates overstress system resources. Numerous load shedding strategies exist to delete excess data. However, the resulting data loss leads to incomplete and/or inaccurate results during the ongoing stream processing. In this paper, we present a novel quality-driven load shedding approach that screens the data stream to find and discard data items of minor quality. The data quality of stream processing results is maximized under the adverse condition of data overload. After an introduction to data quality management in data streams, we define three data quality-driven load shedding algorithms, which minimize the approximation error of aggregations and maximize the completeness of join processing results, respectively. Finally, we demonstrate their superiority over existing load shedding techniques on real-life weather data.
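
The core idea of the abstract, discarding the lowest-quality items when a window exceeds capacity, can be illustrated with a minimal sketch. This is not one of the paper's three algorithms: the `Item` type, the per-item `quality` score in [0, 1], and the function `shed_lowest_quality` are all hypothetical names assumed here for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    value: float    # sensor reading
    quality: float  # hypothetical per-item quality score, assumed to lie in [0, 1]


def shed_lowest_quality(window: List[Item], capacity: int) -> List[Item]:
    """Keep at most `capacity` items, dropping the lowest-quality ones.

    Arrival order of the surviving items is preserved. Ties in quality
    are broken in favour of the earlier item.
    """
    if capacity < 0:
        raise ValueError("capacity must be non-negative")
    if len(window) <= capacity:
        return list(window)  # no overload, nothing to shed
    # Rank positions by descending quality, then by arrival position.
    ranked = sorted(range(len(window)), key=lambda i: (-window[i].quality, i))
    keep = set(ranked[:capacity])
    return [item for i, item in enumerate(window) if i in keep]


window = [Item(20.1, 0.9), Item(35.0, 0.2), Item(20.4, 0.8), Item(19.8, 0.5)]
survivors = shed_lowest_quality(window, capacity=2)
```

With a capacity of two, the sketch keeps the two readings with the highest assumed quality scores and drops the rest; a random load shedder, by contrast, would ignore the scores entirely.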

Similar resources

Loadstar: A Load Shedding Scheme for Classifying Data Streams

We consider the problem of resource allocation in mining multiple data streams. Due to the large volume and the high speed of streaming data, mining algorithms must cope with the effects of system overload. How to realize maximum mining benefits under resource constraints becomes a challenging task. In this paper, we propose a load shedding scheme for classifying multiple data streams. We focus...

QoS-Driven Load Shedding on Data Streams

In this thesis, we are working on the optimized execution of a very large number of continuous queries defined on data streams. Our scope includes both classical query optimization issues adapted to the stream data environment as well as analysis and resolution of overload situations by intelligently discarding data based on application-dependent quality of service (QoS) information. This paper ser...

Loadstar: Load Shedding in Data Stream Mining

In this demo, we show that intelligent load shedding is essential in achieving optimum results in mining data streams under various resource constraints. The Loadstar system introduces load shedding techniques to classifying multiple data streams of large volume and high speed. Loadstar uses a novel metric known as the quality of decision (QoD) to measure the level of uncertainty in classificat...

Content-based Load Shedding in Multimedia Data Stream Management System

Overload management has become very important in public safety systems that analyse high performance multimedia data streams, especially in the case of detection of terrorist and criminal dangers. Efficient overload management improves the accuracy of automatic identification of persons suspected of terrorist or criminal activity without requiring interaction with them. We argue that in order t...

A Quality-Centric Data Model for Distributed Stream Management Systems

It is challenging for large-scale stream management systems to return always perfect results when processing data streams originating from distributed sources. Data sources and intermediate processing nodes may fail during the lifetime of a stream query. In addition, individual nodes may become overloaded due to processing demands. In practice, users have to accept incomplete or inaccurate quer...

Publication date: 2009